Machine learning cosmological structure formation
We train a machine learning algorithm to learn cosmological structure
formation from N-body simulations. The algorithm infers the relationship
between the initial conditions and the final dark matter haloes, without the
need to introduce approximate halo collapse models. We gain insights into the
physics driving halo formation by evaluating the predictive performance of the
algorithm when provided with different types of information about the local
environment around dark matter particles. The algorithm learns to predict
whether or not dark matter particles will end up in haloes of a given mass
range, based on spherical overdensities. We show that the resulting predictions
match those of spherical collapse approximations such as extended
Press-Schechter theory. Additional information on the shape of the local
gravitational potential is not able to improve halo collapse predictions; the
linear density field contains sufficient information for the algorithm to also
reproduce ellipsoidal collapse predictions based on the Sheth-Tormen model. We
investigate the algorithm's performance in terms of halo mass and radial
position and perform blind analyses on independent initial conditions
realisations to demonstrate the generality of our results.
Comment: 10 pages, 7 figures. Minor changes to match version published in
MNRAS. Accepted on 22/06/201
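To illustrate the spherical collapse baseline that the abstract above compares against, here is a minimal sketch of an extended Press-Schechter style prediction for a single particle. It is not the authors' pipeline. It assumes the linear overdensity around the particle has already been measured in spheres of decreasing enclosed mass; the function names, the toy trajectory, and the use of the Einstein-de Sitter collapse threshold of about 1.686 are assumptions made for illustration.

```python
# Sketch only: excursion-set (extended Press-Schechter) style halo prediction
# for one particle. All names and numbers here are illustrative assumptions.

DELTA_C = 1.686  # linear collapse threshold for spherical collapse (EdS value)


def predicted_halo_mass(masses, overdensities, delta_c=DELTA_C):
    """Return the largest smoothing mass at which the linear overdensity
    reaches delta_c, or None if the threshold is never reached.

    masses        -- smoothing mass scales, sorted from largest to smallest
    overdensities -- linear overdensity in a sphere enclosing each mass
    """
    for mass, delta in zip(masses, overdensities):
        if delta >= delta_c:
            return mass  # first crossing when moving from large to small scales
    return None


def in_halo_mass_range(masses, overdensities, m_min, m_max):
    """Binary label: is the particle predicted to end up in a halo whose
    mass lies in [m_min, m_max]?"""
    mass = predicted_halo_mass(masses, overdensities)
    return mass is not None and m_min <= mass <= m_max


# Toy trajectory (masses in solar masses): overdensity grows as the sphere shrinks.
masses = [1e15, 1e14, 1e13, 1e12, 1e11]
trajectory = [0.2, 0.9, 1.8, 2.5, 3.1]

first_crossing = predicted_halo_mass(masses, trajectory)
label = in_halo_mass_range(masses, trajectory, m_min=1e12, m_max=1e14)
```

A learned classifier of the kind described in the abstract would replace the fixed threshold rule with a decision function trained on the same overdensity features.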
Extending BEAMS to incorporate correlated systematic uncertainties
New supernova surveys such as the Dark Energy Survey, Pan-STARRS and the LSST
will produce an unprecedented number of photometric supernova candidates, most
with no spectroscopic data. Avoiding biases in cosmological parameters due to
the resulting inevitable contamination from non-Ia supernovae can be achieved
with the BEAMS formalism, allowing for fully photometric supernova cosmology
studies. Here we extend BEAMS to deal with the case in which the supernovae are
correlated by systematic uncertainties. The analytical form of the full BEAMS
posterior requires evaluating 2^N terms, where N is the number of supernova
candidates. This `exponential catastrophe' is computationally unfeasible even
for N of order 100. We circumvent the exponential catastrophe by marginalising
numerically instead of analytically over the possible supernova types: we
augment the cosmological parameters with nuisance parameters describing the
covariance matrix and the types of all the supernovae, \tau_i, that we include
in our MCMC analysis. We show that this method deals well even with large,
unknown systematic uncertainties without a major increase in computational
time, whereas ignoring the correlations can lead to significant biases and
incorrect credible contours. We then compare the numerical marginalisation
technique with a perturbative expansion of the posterior based on the insight
that future surveys will have exquisite light curves and hence the probability
that a given candidate is a Type Ia will be close to unity or zero, for most
objects. Although this perturbative approach changes computation of the
posterior from a 2^N problem into an N^2 or N^3 one, we show that it leads to
biases in general through a small number of misclassifications, implying that
numerical marginalisation is superior.
Comment: Resubmitted under married name Lochner (formerly Knights). Version 3:
major changes, including a large scale analysis with thousands of MCMC
chains. Matches version published in JCAP. 23 pages, 8 figures
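The 2^N scaling described in the abstract above can be made concrete with a small sketch. It enumerates every possible assignment of types for N toy candidates and checks that, when the objects are uncorrelated, the sum collapses to a product of N two-component mixtures. The Gaussian likelihoods, the residuals and the Ia probabilities are all invented for illustration and are not taken from the paper.

```python
# Sketch only: why the BEAMS posterior has 2^N terms, and why the sum
# collapses to a product when the supernovae are uncorrelated.
# The Gaussian likelihoods and all numbers are illustrative assumptions.
import itertools
import math


def gauss(x, mean, sigma):
    """Normal probability density."""
    norm = sigma * math.sqrt(2.0 * math.pi)
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2) / norm


def like_ia(x):
    return gauss(x, 0.0, 0.2)  # assumed Ia population


def like_non(x):
    return gauss(x, 2.0, 0.5)  # assumed non-Ia population


def brute_force_likelihood(data, p_ia):
    """Sum over all 2^N type vectors tau (1 = Ia, 0 = non-Ia)."""
    total = 0.0
    for tau in itertools.product((0, 1), repeat=len(data)):
        term = 1.0
        for x, p, t in zip(data, p_ia, tau):
            term *= p * like_ia(x) if t else (1.0 - p) * like_non(x)
        total += term
    return total


def factorised_likelihood(data, p_ia):
    """Product of N two-component mixtures; valid only without correlations."""
    total = 1.0
    for x, p in zip(data, p_ia):
        total *= p * like_ia(x) + (1.0 - p) * like_non(x)
    return total


# Toy residuals and Ia probabilities for N = 10 candidates.
data = [0.1, -0.2, 0.05, 1.9, 0.3, -0.1, 2.2, 0.0, 0.15, -0.3]
p_ia = [0.95, 0.9, 0.99, 0.1, 0.8, 0.97, 0.05, 0.9, 0.85, 0.92]

n_terms = 2 ** len(data)  # number of type vectors the brute-force sum visits
brute = brute_force_likelihood(data, p_ia)
fast = factorised_likelihood(data, p_ia)
```

Once a covariance matrix couples the objects, the likelihood for a given type vector no longer splits into per-object factors, so only the brute-force form remains exact. That is the situation in which sampling the types inside the MCMC chain, as the abstract describes, becomes attractive.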
Towards the Future of Supernova Cosmology
For future surveys, spectroscopic follow-up for all supernovae will be
extremely difficult. However, one can use light curve fitters to obtain the
probability that an object is a Type Ia. One may consider applying a
probability cut to the data, but we show that the resulting non-Ia
contamination can lead to biases in the estimation of cosmological parameters.
A different method, which allows the use of the full dataset and results in
unbiased cosmological parameter estimation, is Bayesian Estimation Applied to
Multiple Species (BEAMS). BEAMS is a Bayesian approach to the problem which
includes the uncertainty in the types in the evaluation of the posterior. Here
we outline the theory of BEAMS and demonstrate its effectiveness using both
simulated datasets and SDSS-II data. We also show that it is possible to use
BEAMS if the data are correlated, by introducing a numerical marginalisation
over the types of the objects. This is largely a pedagogical introduction to
BEAMS with references to the main BEAMS papers.
Comment: Replaced under married name Lochner (formerly Knights). 3 pages, 2
figures. To appear in the Proceedings of 13th Marcel Grossmann Meeting
(MG13), Stockholm, Sweden, 1-7 July 201
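As a rough illustration of how BEAMS folds the type uncertainty into the posterior for uncorrelated data, the sketch below fits a single offset parameter to a small contaminated sample and compares the result with a naive average that treats every object as a Type Ia. The one-parameter model, the population widths, the non-Ia shift and the data values are all assumptions chosen for the example; a real analysis would fit cosmological parameters to distance moduli.

```python
# Sketch only: BEAMS-style mixture likelihood for one parameter `mu`,
# evaluated on a grid. Model, widths and data are illustrative assumptions.
import math


def gauss(x, mean, sigma):
    """Normal probability density."""
    norm = sigma * math.sqrt(2.0 * math.pi)
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2) / norm


def beams_log_likelihood(mu, data, p_ia,
                         sigma_ia=0.15, non_ia_shift=1.5, sigma_non=0.6):
    """Sum over objects of log[P_i * L_Ia + (1 - P_i) * L_nonIa]."""
    total = 0.0
    for x, p in zip(data, p_ia):
        l_ia = p * gauss(x, mu, sigma_ia)
        l_non = (1.0 - p) * gauss(x, mu + non_ia_shift, sigma_non)
        total += math.log(l_ia + l_non)
    return total


# Six Ia-like residuals scattered around zero, plus two contaminants.
data = [0.05, -0.10, 0.12, -0.03, 0.08, -0.07, 1.4, 1.7]
p_ia = [0.95, 0.95, 0.95, 0.95, 0.95, 0.95, 0.10, 0.10]

# Grid search for the maximum of the BEAMS likelihood.
grid = [i / 100.0 for i in range(-50, 51)]
mu_beams = max(grid, key=lambda mu: beams_log_likelihood(mu, data, p_ia))

# Naive estimate: treat every object as a Type Ia (about 0.39 here).
mu_naive = sum(data) / len(data)
```

In this toy setup the true offset is zero, so the naive average is pulled towards the contaminants while the mixture likelihood stays close to the Ia population.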
Astronomaly at Scale: Searching for Anomalies Amongst 4 Million Galaxies
Modern astronomical surveys are producing datasets of unprecedented size and
richness, increasing the potential for high-impact scientific discovery. This
possibility, coupled with the challenge of exploring a large number of sources,
has led to the development of novel machine-learning-based anomaly detection
approaches, such as Astronomaly. For the first time, we test the scalability of
Astronomaly by applying it to almost 4 million images of galaxies from the Dark
Energy Camera Legacy Survey. We use a trained deep learning algorithm to learn
useful representations of the images and pass these to the anomaly detection
algorithm isolation forest, coupled with Astronomaly's active learning method,
to discover interesting sources. We find that data selection criteria have a
significant impact on the trade-off between finding rare sources such as strong
lenses and introducing artefacts into the dataset. We demonstrate that active
learning is required to identify the most interesting sources and reduce
artefacts, while anomaly detection methods alone are insufficient. Using
Astronomaly, we find 1635 anomalies among the top 2000 sources in the dataset
after applying active learning, including 8 strong gravitational lens
candidates, 1609 galaxy merger candidates, and 18 previously unidentified
sources exhibiting highly unusual morphology. Our results show that by
leveraging the human-machine interface, Astronomaly is able to rapidly identify
sources of scientific interest even in large datasets.
Comment: 15 pages, 9 figures. Comments welcome, especially suggestions about
the anomalous sources
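The isolation forest step mentioned in the abstract above can be sketched in a few lines. This is a simplified, self-contained version written for illustration, not Astronomaly's implementation: it builds every tree on the full sample rather than on random subsamples, uses toy two-dimensional features in place of deep-learned image representations, and leaves out the active learning stage entirely.

```python
# Sketch only: a minimal isolation forest on toy 2-D features.
# Simplified for illustration; names and data are assumptions.
import math
import random


def build_tree(points, rng, depth=0, max_depth=12):
    """Recursively split on a random feature at a random threshold."""
    if len(points) <= 1 or depth >= max_depth:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:
        return {"size": len(points)}
    cut = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < cut]
    right = [p for p in points if p[dim] >= cut]
    return {"dim": dim, "cut": cut,
            "left": build_tree(left, rng, depth + 1, max_depth),
            "right": build_tree(right, rng, depth + 1, max_depth)}


def average_path(n):
    """Average path length of an unsuccessful binary-tree search over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n


def path_length(tree, point, depth=0):
    """Depth at which `point` lands in a leaf, corrected for unsplit leaves."""
    if "dim" not in tree:
        return depth + average_path(tree["size"])
    branch = "left" if point[tree["dim"]] < tree["cut"] else "right"
    return path_length(tree[branch], point, depth + 1)


def anomaly_scores(points, n_trees=100, seed=0):
    """Score in (0, 1); higher means easier to isolate, hence more anomalous."""
    rng = random.Random(seed)
    trees = [build_tree(points, rng) for _ in range(n_trees)]
    norm = average_path(len(points))
    scores = []
    for p in points:
        mean_depth = sum(path_length(t, p) for t in trees) / n_trees
        scores.append(2.0 ** (-mean_depth / norm))
    return scores


# Toy data: a Gaussian cluster of "ordinary" sources plus one distant outlier.
data_rng = random.Random(1)
points = [(data_rng.gauss(0.0, 1.0), data_rng.gauss(0.0, 1.0)) for _ in range(100)]
points.append((8.0, 8.0))

scores = anomaly_scores(points)
most_anomalous = max(range(len(points)), key=lambda i: scores[i])
```

Ranking sources by this score gives the raw anomaly list. The abstract's point is that such a ranking also surfaces artefacts, which is why human relevance labels are then used to reorder it.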
New applications of statistics in astronomy and cosmology
Includes bibliographical references.
Over the last few decades, astronomy and cosmology have become data-driven fields. The parallel increase in computational power has naturally led to the adoption of more sophisticated statistical techniques for data analysis in these fields, in particular Bayesian methods. As the next generation of instruments comes online, this trend should continue, since previously ignored effects must be treated rigorously to avoid biases and incorrect scientific conclusions being drawn from the ever-improving data.
In the context of supernova cosmology, an example of this is the challenge of contamination, as supernova datasets will become too large to spectroscopically confirm the types of all objects. The technique known as BEAMS (Bayesian Estimation Applied to Multiple Species) handles this contamination with a fully Bayesian mixture model approach, which allows unbiased estimates of the cosmological parameters. Here, we extend the original BEAMS formalism to deal with correlated systematics in supernova data, which we test extensively on thousands of simulated datasets using numerical marginalization and Markov Chain Monte Carlo (MCMC) sampling over the unknown type of the supernova, showing that it recovers unbiased cosmological parameters with good coverage.
We then apply Bayesian statistics to the field of radio interferometry. This is particularly relevant in light of the SKA telescope, where the data will be of such high quantity and quality that current techniques will not be adequate to fully exploit it. We show that the current approach to deconvolution of radio interferometric data is susceptible to biases induced by ignored and unknown instrumental effects such as pointing errors, which in general are correlated with the science parameters.
We develop an alternative approach - Bayesian Inference for Radio Observations (BIRO) - which is able to determine the joint posterior for all scientific and instrumental parameters. We test BIRO on several simulated datasets and show that it is superior to the standard CLEAN and source extraction algorithms. BIRO fits all parameters simultaneously while providing unbiased estimates - and errors - for the noise, beam width, pointing errors and the fluxes and shapes of the sources
- …